## [1] "Excluded 1 of 97 participants based on catch-trial performance."
We compute the area under the curve (AUC) directly, using the AUC function with the spline method.
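As a rough illustration of what a spline-based AUC involves, the base-R sketch below fits an interpolating spline through a set of points and integrates it over their range. The helper name `auc_spline` and the example values are hypothetical, and the AUC function used in the analysis may differ in detail.

```r
# Hypothetical helper: area under a natural cubic spline through (x, y).
auc_spline <- function(x, y) {
  f <- splinefun(x, y, method = "natural")
  integrate(f, lower = min(x), upper = max(x))$value
}

# Made-up values on a 0-100 grid, for illustration only.
x_grid <- seq(0, 100, by = 10)
y_vals <- c(0, 2, 5, 10, 20, 35, 50, 60, 55, 40, 20)

auc_example <- auc_spline(x_grid, y_vals)
print(auc_example)
```

For points that lie on a straight line the spline is that line, so the result matches the exact triangle or trapezoid area; this makes a convenient sanity check.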
Two-sample t-test and mixed-effects regression model with control variables:
##
## Two Sample t-test
##
## data: aucs.cautious$auc_diff and aucs.confident$auc_diff
## t = 4.0994, df = 190, p-value = 6.133e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 6.49181 18.53303
## sample estimates:
## mean of x mean of y
## 19.136364 6.623944
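The values printed by the t-test can be cross-checked against each other. The sketch below only reuses the numbers from the output above (the variable names are illustrative): it recovers the standard error from the confidence interval and recomputes the test statistic and p-value.

```r
# Values copied from the t-test output above.
mean_x <- 19.136364
mean_y <- 6.623944
ci     <- c(6.49181, 18.53303)
df     <- 190

mean_diff <- mean_x - mean_y                      # also the midpoint of the CI
se        <- diff(ci) / (2 * qt(0.975, df))       # CI half-width divided by the critical t
t_stat    <- mean_diff / se
p_value   <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

print(c(mean_diff = mean_diff, se = se, t = t_stat, p = p_value))
```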
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula:
## auc_diff ~ cond + test_order + first_speaker_type + confident_speaker +
## first_speaker_type * cond + (1 | workerid)
## Data: auc_d
##
## REML criterion at convergence: 1679.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.89840 -0.52800 -0.00835 0.61877 2.15745
##
## Random effects:
## Groups Name Variance Std.Dev.
## workerid (Intercept) 131.2 11.45
## Residual 303.2 17.41
## Number of obs: 192, groups: workerid, 96
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 12.9483 1.7170 92.0000 7.541 3.21e-11 ***
## cond1 6.2526 1.2569 94.0000 4.975 2.94e-06 ***
## test_order1 0.1573 1.7201 92.0000 0.091 0.92734
## first_speaker_type1 -4.5748 1.7216 92.0000 -2.657 0.00929 **
## confident_speaker1 1.3037 1.7186 92.0000 0.759 0.45004
## cond1:first_speaker_type1 0.1747 1.2569 94.0000 0.139 0.88976
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) cond1 tst_r1 frs__1 cnfd_1
## cond1 0.000
## test_order1 0.002 0.000
## frst_spkr_1 -0.022 0.000 -0.063
## cnfdnt_spk1 -0.022 0.000 -0.024 0.044
## cnd1:frs__1 0.000 -0.021 0.000 0.000 0.000
library(mclust)
## Package 'mclust' version 5.4.10
## Type 'citation("mclust")' for citing this R package in publications.
##
## Attaching package: 'mclust'
## The following object is masked from 'package:DescTools':
##
## BrierScore
## The following object is masked from 'package:bootstrap':
##
## diabetes
# Join the per-participant AUC differences of both conditions (.x = cautious, .y = confident).
aucs_diff = merge(aucs.cautious, aucs.confident, by=c("workerid"))
aucs_diff$diff_of_diffs = aucs_diff$auc_diff.x - aucs_diff$auc_diff.y
aucs_diff %>%
  ggplot(aes(x=diff_of_diffs)) +
  geom_density() + geom_jitter(aes(y=0), width=0, height=0.001) +
  ggtitle("Raw data + estimated density")
1 Cluster
fit1 = Mclust(aucs_diff$diff_of_diffs, G=1)
print(summary(fit1, parameters=TRUE))
## ----------------------------------------------------
## Gaussian finite mixture model fitted by EM algorithm
## ----------------------------------------------------
##
## Mclust X (univariate normal) model with 1 component:
##
## log-likelihood n df BIC ICL
## -442.7777 96 2 -894.6842 -894.6842
##
## Clustering table:
## 1
## 96
##
## Mixing probabilities:
## 1
## 1
##
## Means:
## [1] 12.51242
##
## Variances:
## [1] 593.8692
2 Clusters
fit2 = Mclust(aucs_diff$diff_of_diffs, G=2)
print(summary(fit2, parameters=T))
## ----------------------------------------------------
## Gaussian finite mixture model fitted by EM algorithm
## ----------------------------------------------------
##
## Mclust E (univariate, equal variance) model with 2 components:
##
## log-likelihood n df BIC ICL
## -431.9405 96 4 -882.1383 -890.0352
##
## Clustering table:
## 1 2
## 76 20
##
## Mixing probabilities:
## 1 2
## 0.7830993 0.2169007
##
## Means:
## 1 2
## 2.067593 50.222483
##
## Variances:
## 1 2
## 199.9941 199.9941
3 Clusters
fit3 = Mclust(aucs_diff$diff_of_diffs, G=3)
print(summary(fit3, parameters=T))
## ----------------------------------------------------
## Gaussian finite mixture model fitted by EM algorithm
## ----------------------------------------------------
##
## Mclust E (univariate, equal variance) model with 3 components:
##
## log-likelihood n df BIC ICL
## -431.8342 96 6 -891.0545 -952.9829
##
## Clustering table:
## 1 2 3
## 16 60 20
##
## Mixing probabilities:
## 1 2 3
## 0.2790258 0.5130602 0.2079141
##
## Means:
## 1 2 3
## -5.590971 6.656549 51.257886
##
## Variances:
## 1 2 3
## 172.7069 172.7069 172.7069
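According to the Bayesian information criterion (BIC), the model with two clusters describes the data best: its BIC of -882.14 is higher than that of the one-cluster (-894.68) and three-cluster (-891.05) models. Note that mclust reports BIC as 2 * log-likelihood - df * log(n), so larger values are better.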
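To make the comparison explicit, the sketch below recomputes the BIC from the log-likelihoods, parameter counts, and sample size printed in the three summaries, assuming mclust's convention BIC = 2 * logLik - df * log(n) (larger is better). The values are hard-coded from the output above rather than read from the fitted objects, and the data frame name is illustrative.

```r
# Log-likelihood and number of parameters for G = 1, 2, 3 (copied from the summaries).
fits <- data.frame(
  G      = 1:3,
  loglik = c(-442.7777, -431.9405, -431.8342),
  df     = c(2, 4, 6)
)
n <- 96

# mclust-style BIC: higher values indicate a better trade-off of fit and complexity.
fits$bic <- 2 * fits$loglik - fits$df * log(n)
best_G   <- fits$G[which.max(fits$bic)]

print(fits)
print(best_G)
```

The third component raises the log-likelihood by only about 0.1 while adding two parameters, which is why its BIC falls below that of the two-component model.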
Fitted model:
aucs_diff %>%
  ggplot(aes(x=diff_of_diffs)) +
  geom_jitter(aes(y=0, color=first_speaker_type.x), width=0, height=0.001) +
  ggtitle("Raw data + Components of gaussian mixture") +
  # The equal-variance ("E") model stores one shared variance in sigmasq, so sigmasq[2] is NA
  # and the second curve would be dropped with a "Removed 101 row(s)" warning; reuse sigmasq[1].
  stat_function(fun = dnorm, args = list(mean = fit2$parameters$mean[1], sd = sqrt(fit2$parameters$variance$sigmasq[1]))) +
  stat_function(fun = dnorm, args = list(mean = fit2$parameters$mean[2], sd = sqrt(fit2$parameters$variance$sigmasq[1])))
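The plot above draws each component as an unweighted normal density. As a sketch of how the fitted two-component mixture combines them, the snippet below weights each component by its mixing probability and finds the point at which both components are equally probable. The parameter values are hard-coded from the 2-cluster summary rather than read from fit2, and the function names `mixture_density` and `posterior_comp2` are hypothetical.

```r
# Parameters copied from the 2-cluster summary (equal-variance model).
pro   <- c(0.7830993, 0.2169007)   # mixing probabilities
mu    <- c(2.067593, 50.222483)    # component means
sigma <- sqrt(199.9941)            # shared standard deviation

# Density of the full mixture: components weighted by their mixing probabilities.
mixture_density <- function(x) {
  pro[1] * dnorm(x, mu[1], sigma) + pro[2] * dnorm(x, mu[2], sigma)
}

# Posterior probability that a value x belongs to the second component.
posterior_comp2 <- function(x) {
  pro[2] * dnorm(x, mu[2], sigma) / mixture_density(x)
}

# The mixture should integrate to 1 (bounds lie more than 10 SDs beyond both means).
total_mass <- integrate(mixture_density, lower = -150, upper = 200)$value

# Value of diff_of_diffs at which both components are equally likely.
boundary <- uniroot(function(x) posterior_comp2(x) - 0.5, interval = mu)$root

print(c(total_mass = total_mass, boundary = boundary))
```

Because the second component has the smaller mixing probability, the boundary lies closer to its mean than the midpoint between the two means.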
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: most_likely_model ~ condition + test_order + first_speaker_type +
## first_speaker_type * condition + (1 | workerid)
## Data: d.post_test
##
## AIC BIC logLik deviance df.resid
## 240.6 260.1 -114.3 228.6 186
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.8822 -0.5659 -0.3347 0.6434 2.1053
##
## Random effects:
## Groups Name Variance Std.Dev.
## workerid (Intercept) 1.724 1.313
## Number of obs: 192, groups: workerid, 96
##
## Fixed effects:
## Estimate Std. Error z value
## (Intercept) -0.4888 0.2419 -2.021
## conditioncautious -0.8466 0.2250 -3.763
## test_orderparallel 0.3204 0.2331 1.375
## first_speaker_typecautious 0.5943 0.2473 2.403
## conditioncautious:first_speaker_typecautious -0.1637 0.1873 -0.874
## Pr(>|z|)
## (Intercept) 0.043291 *
## conditioncautious 0.000168 ***
## test_orderparallel 0.169151
## first_speaker_typecautious 0.016266 *
## conditioncautious:first_speaker_typecautious 0.382287
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) cndtnc tst_rd frst__
## conditincts 0.225
## tst_rdrprll -0.081 -0.150
## frst_spkr_t -0.173 -0.268 0.043
## cndtncts:__ -0.025 0.037 -0.038 -0.002
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: likelihood_ratio ~ condition + test_order + first_speaker_type +
## first_speaker_type * condition + (1 | workerid)
## Data: d.post_test
##
## REML criterion at convergence: 2535
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9295 -0.5890 -0.0079 0.5597 2.5589
##
## Random effects:
## Groups Name Variance Std.Dev.
## workerid (Intercept) 11806 108.7
## Residual 29251 171.0
## Number of obs: 192, groups: workerid, 96
##
## Fixed effects:
## Estimate Std. Error df
## (Intercept) -37.803 16.597 93.000
## conditioncautious -54.548 12.346 94.000
## test_orderparallel -3.505 16.625 93.000
## first_speaker_typecautious 43.118 16.629 93.000
## conditioncautious:first_speaker_typecautious -12.527 12.346 94.000
## t value Pr(>|t|)
## (Intercept) -2.278 0.0250 *
## conditioncautious -4.418 2.66e-05 ***
## test_orderparallel -0.211 0.8335
## first_speaker_typecautious 2.593 0.0111 *
## conditioncautious:first_speaker_typecautious -1.015 0.3129
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) cndtnc tst_rd frst__
## conditincts 0.000
## tst_rdrprll 0.001 0.000
## frst_spkr_t -0.021 0.000 -0.063
## cndtncts:__ 0.000 -0.021 0.000 0.000
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: likelihood_ratio ~ condition + test_order + first_speaker_type +
## prior_likelihood_ratio + first_speaker_type * condition +
## (1 | workerid)
## Data: d.post_test
##
## REML criterion at convergence: 2535.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9623 -0.5553 -0.0654 0.5574 2.8424
##
## Random effects:
## Groups Name Variance Std.Dev.
## workerid (Intercept) 11385 106.7
## Residual 29251 171.0
## Number of obs: 192, groups: workerid, 96
##
## Fixed effects:
## Estimate Std. Error df
## (Intercept) -19.8854 19.9835 92.0000
## conditioncautious -54.5478 12.3457 94.0000
## test_orderparallel -2.6191 16.5023 92.0000
## first_speaker_typecautious 44.0795 16.5075 92.0000
## prior_likelihood_ratio 0.1738 0.1098 92.0000
## conditioncautious:first_speaker_typecautious -12.5267 12.3457 94.0000
## t value Pr(>|t|)
## (Intercept) -0.995 0.32230
## conditioncautious -4.418 2.66e-05 ***
## test_orderparallel -0.159 0.87424
## first_speaker_typecautious 2.670 0.00896 **
## prior_likelihood_ratio 1.582 0.11708
## conditioncautious:first_speaker_typecautious -1.015 0.31287
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) cndtnc tst_rd frst__ prr_l_
## conditincts 0.000
## tst_rdrprll 0.020 0.000
## frst_spkr_t 0.004 0.000 -0.061
## prr_lklhd_r 0.567 0.000 0.034 0.037
## cndtncts:__ 0.000 -0.021 0.000 0.000 0.000
## Data: d.post_test
## Models:
## model1: likelihood_ratio ~ condition + test_order + first_speaker_type + first_speaker_type * condition + (1 | workerid)
## model2: likelihood_ratio ~ condition + test_order + first_speaker_type + prior_likelihood_ratio + first_speaker_type * condition + (1 | workerid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## model1 7 2585.1 2607.9 -1285.5 2571.1
## model2 8 2584.5 2610.6 -1284.2 2568.5 2.5767 1 0.1084
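The quantities in the model comparison table are related in a simple way, which the sketch below reproduces from the printed (rounded) values; the variable names are illustrative. It assumes the usual definitions AIC = deviance + 2 * npar and BIC = deviance + npar * log(n), with n = 192 observations, and a chi-square reference distribution with one degree of freedom for the likelihood-ratio statistic.

```r
# Values copied from the model comparison table above.
npar     <- c(model1 = 7, model2 = 8)
deviance <- c(model1 = 2571.1, model2 = 2568.5)
n_obs    <- 192

aic <- deviance + 2 * npar
bic <- deviance + npar * log(n_obs)

# Likelihood-ratio test: reported statistic, df = difference in parameter counts.
chisq   <- 2.5767
lrt_df  <- unname(diff(npar))
p_value <- pchisq(chisq, df = lrt_df, lower.tail = FALSE)

print(rbind(aic, bic))
print(p_value)
```

Adding prior_likelihood_ratio lowers the AIC slightly but raises the BIC, and the likelihood-ratio test does not reach significance at the 0.05 level, consistent with the non-significant coefficient in the model summary.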
| workerid | first_speaker_type | test_order | cautious_count | confident_count | aligned_count | first_adaptation_speaker_count |
|---|---|---|---|---|---|---|
| 1574 | cautious | parallel | 1 | 1 | 2 | 1 |
| 1576 | confident | reverse | 1 | 1 | 2 | 1 |
| 1586 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1590 | cautious | parallel | 1 | 1 | 2 | 1 |
| 1593 | confident | reverse | 1 | 1 | 2 | 1 |
| 1597 | confident | reverse | 1 | 1 | 2 | 1 |
| 1600 | cautious | parallel | 1 | 1 | 2 | 1 |
| 1602 | confident | reverse | 1 | 1 | 2 | 1 |
| 1608 | confident | parallel | 1 | 1 | 2 | 1 |
| 1609 | confident | reverse | 1 | 1 | 2 | 1 |
| 1610 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1615 | cautious | parallel | 1 | 1 | 2 | 1 |
| 1619 | confident | parallel | 1 | 1 | 2 | 1 |
| 1622 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1623 | confident | parallel | 1 | 1 | 2 | 1 |
| 1624 | cautious | parallel | 1 | 1 | 2 | 1 |
| 1625 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1629 | confident | reverse | 1 | 1 | 2 | 1 |
| 1631 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1637 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1642 | cautious | parallel | 1 | 1 | 2 | 1 |
| 1644 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1653 | confident | reverse | 1 | 1 | 2 | 1 |
| 1655 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1656 | cautious | parallel | 1 | 1 | 2 | 1 |
| 1658 | cautious | parallel | 1 | 1 | 2 | 1 |
| 1667 | confident | parallel | 1 | 1 | 2 | 1 |
| 1668 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1671 | cautious | reverse | 1 | 1 | 2 | 1 |
| 1673 | confident | parallel | 1 | 1 | 2 | 1 |
| 1674 | cautious | parallel | 1 | 1 | 2 | 1 |
| 1676 | cautious | parallel | 1 | 1 | 2 | 1 |
| workerid | first_speaker_type | test_order | cautious_count | confident_count | aligned_count | first_adaptation_speaker_count |
|---|---|---|---|---|---|---|
| 1582 | cautious | parallel | 1 | 1 | 0 | 1 |
| 1583 | cautious | parallel | 1 | 1 | 0 | 1 |
| 1596 | cautious | reverse | 1 | 1 | 0 | 1 |
| 1621 | confident | reverse | 1 | 1 | 0 | 1 |
| 1677 | confident | parallel | 1 | 1 | 0 | 1 |